Development and validation of a clinical predictive model for severe and critical pediatric COVID-19 infection

Introduction Children infected with COVID-19 are susceptible to severe manifestations. We aimed to develop and validate a predictive model for severe/critical pediatric COVID-19 infection utilizing routinely available hospital level data to ascertain the likelihood of developing severe manifestations. Methods The predictive model was based on an analysis of registry data from COVID-19 positive patients admitted to five tertiary pediatric hospitals across Asia [Singapore, Malaysia, Indonesia (two centers) and Pakistan]. Independent predictors of severe/critical COVID-19 infection were determined using multivariable logistic regression. A training cohort (n = 802, 70%) was used to develop the prediction model, which was then validated in a validation cohort (n = 345, 30%). The discriminative ability and performance of this model were assessed by calculating the Area Under the Curve (AUC) and 95% confidence interval (CI) from the final Receiver Operating Characteristic (ROC) curve. Results A total of 1147 patients were included in this analysis. In the multivariable model, infant age group, presence of comorbidities, fever, vomiting, seizures and higher absolute neutrophil count were associated with an increased risk of developing severe/critical COVID-19 infection. The presence of coryza at presentation and higher hemoglobin and platelet counts were associated with a decreased risk of severe/critical COVID-19 infection. The AUCs (95% CI) generated for this model from the training and validation cohorts were 0.96 (0.94, 0.98) and 0.92 (0.86, 0.97), respectively. Conclusion This predictive model using clinical history and commonly used laboratory values was valuable in estimating the risk of developing a severe/critical COVID-19 infection in hospitalized children. Further validation is needed to provide more insights into its utility in clinical practice.


Introduction
Since the beginning of the COVID-19 pandemic in late 2019, the infection rate and overall disease severity have been reported to be low in children [1,2]. Up to January 2022, more than 8.5 million children in the United States had been infected, with an estimated hospitalization rate of 1.7%-4.3% and a low mortality rate of 0.02% [3]. However, reports of children developing severe COVID-19 infection requiring hospitalization and pediatric intensive care unit (PICU) care are increasing worldwide [4][5][6][7]. Children infected with COVID-19 are susceptible to serious manifestations such as acute respiratory distress syndrome (ARDS), shock, stroke, and multisystem inflammatory syndrome in children (MIS-C) [5][8][9][10][11]. The reported mortality of these severe phenotypes ranges from 3.5% to 7% [7,12].
Many countries have approved the use of COVID-19 vaccines for children above 12 years of age [13,14] and clinical trials are ongoing to assess their safety and efficacy in adolescents and even younger children (NCT04796896) [15]. However, until these vaccines are approved and made readily available for younger children, they will continue to be at risk of severe infection. Additionally, infants and children with underlying comorbid conditions have been found to be at increased risk for developing severe manifestations of COVID-19 infection [16][17][18].
In Asia, children may be especially prone to develop severe COVID-19 infection and associated mortality. This susceptibility to severe disease and high mortality rate may be a reflection of socioeconomic factors, cultural factors, hospital admission criteria, management factors and low vaccine coverage [5]. In previous studies from Pakistan and India, children hospitalized with COVID-19 infection or MIS-C had a high mortality (10-20%) [19,20]. The literature also reports various risk factors associated with severe COVID-19 infection and mortality in children, including age less than one year, associated comorbid conditions, evidence of acute inflammation and presence of organ dysfunction [21][22][23]. An early predictive model to identify patients who may progress to severe COVID-19 infection can help stratify patients who may benefit from closer monitoring and admission to a higher level of care. In this study, we aimed to develop and validate a predictive model for severe/critical pediatric COVID-19 infection utilizing routinely available hospital level data.

Study design
We developed and validated a prediction model based on an analysis of registry data of patients admitted to five tertiary pediatric hospitals across Asia [Singapore, Malaysia, Indonesia (two centers) and Pakistan]. These centers contributed data to the Pediatric Acute and Critical Care COVID-19 Registry of Asia (PACCOVRA), a registry (ClinicalTrials.gov registration NCT04395781) within the Pediatric Acute and Critical Care Medicine Asian Network (PACCMAN). Inclusion criteria for patient data were (1) confirmed COVID-19 infection, defined by a positive COVID-19 nucleic acid reverse transcriptase polymerase chain reaction (RT-PCR) test on a nasopharyngeal aspirate; or (2) confirmed MIS-C, defined by the Centers for Disease Control and Prevention (CDC) criteria [24]; and (3) admission to a participating hospital from November 2019 to November 2021. Epidemiological, clinical, laboratory and outcome data were extracted retrospectively at participating sites, and anonymized data were entered into a secure centralized database set up using the Research Electronic Data Capture (REDCap) system by the main coordinating center in Singapore [25]. Outcome data were captured upon discharge from the hospital. The primary outcome was severe/critical COVID-19 infection. Severity of COVID-19 infection was classified into four groups based on the World Health Organization (WHO) definition (mild, moderate, severe and critical) [26]. Secondary outcomes included hospital length of stay, final respiratory-related diagnosis, respiratory support, supportive therapies and organ dysfunction. Organ dysfunction was defined by the International Pediatric Sepsis Consensus Conference criteria [27]. The management of patients with COVID-19 infection at each participating site was at the discretion of the managing team and no standardised protocol was utilised. The criteria for admission to intermediate care and intensive care were also at the discretion of the managing team.

Statistical analysis
The primary outcome, COVID-19 infection severity, was categorized as binary data with categories mild/moderate or severe/critical infection. All variables were summarized based on COVID-19 infection severity. Categorical and continuous variables were presented as counts (percentages) and median (interquartile range (IQR)), respectively. The chi-square test and the Mann-Whitney U test were used to compare categorical and continuous variables, respectively, with respect to COVID-19 infection severity.
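As an illustration of the categorical comparison described above, the following is a minimal sketch of a Pearson chi-square test on a 2x2 table (for example, presence or absence of a symptom by severity group). The function name `chi_square_2x2` and the counts are hypothetical and are not taken from the study; the study's own analysis was run in R and SAS, and this sketch omits any continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    The table is laid out as
        [[a, b],
         [c, d]]
    with rows as outcome groups and columns as feature present/absent.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts for illustration only (not study data).
statistic, p_value = chi_square_2x2(10, 20, 20, 10)
```

The closed-form p-value holds only for one degree of freedom; larger tables would need a general chi-square distribution function from a statistics library.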
The eligible sample (n = 1147) was randomly split into a training cohort (n = 802, 70%) and a validation cohort (n = 345, 30%). Data were randomly split to avoid selection bias for any of the variables in the training and validation cohorts. The prediction risk model was created using the training cohort. Univariate and multivariable logistic regression models were used to find independent predictors of severe/critical COVID-19 infection. A generalized linear mixed model (GLIMMIX) approach for binary data, with site as a random effect, was used for regression analysis. Covariates considered for inclusion in the model were identified a priori, without knowledge of the outcome data, based on clinical judgement and potential confounders identified in the univariate logistic regression. Variables with p value < 0.2 in the univariate logistic regression model were chosen for the multivariable model. Backward, forward and stepwise variable selection were used to determine the final predictors of severe/critical COVID-19 infection. The adjusted β coefficient with standard error (SE) and corresponding odds ratio (OR) with 95% confidence interval (CI) were reported for each predictor. The model constant and β coefficient for each predictor were used to generate the predicted probability equation. The prediction model was assessed using a calibration plot.
The discriminative ability and performance of the model were assessed by calculating the Area Under the Curve (AUC) from the final Receiver Operating Characteristic (ROC) curve. Laboratory test results were incorporated in order to assist clinicians in identifying patients who may develop severe/critical COVID-19 infections. The logistic regression model yields a score based on a linear combination of the selected variables. These scores were also reported for the full and reduced sets of variables. This score can be converted to an estimated probability of severe/critical COVID-19 infection using the relationship: estimated probability = e^{score} / (1 + e^{score}), where e is the base of the natural logarithm. Because laboratory tests (e.g., C-reactive protein, procalcitonin) were not mandatory for each center, we expected missing data in these values. For patients who had no admission laboratory data, the first available laboratory data within that admission were used. In our sensitivity analysis, we applied the final multivariable model in two randomly selected sites (KKH and UMMC, and again AKUH and MTMH), and performed a separate analysis excluding MIS-C patients from both the training and validation data to check the robustness of the model.
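The 70/30 random split and the conversion of a β coefficient and its SE into an odds ratio with a Wald 95% CI can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function names, the seed and the use of integer patient identifiers are hypothetical, and the sketch does not reproduce the mixed-model fitting with site as a random effect.

```python
import math
import random


def split_cohort(patient_ids, train_percent=70, seed=2021):
    """Randomly split patient identifiers into training and validation lists.

    The seed is an arbitrary, hypothetical choice that makes the split
    reproducible; the training size is rounded down.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * train_percent // 100
    return ids[:n_train], ids[n_train:]


def odds_ratio_with_ci(beta, se, z=1.959964):
    """Return (OR, lower, upper) from a log-odds coefficient and its SE.

    Uses the Wald interval exp(beta +/- z * se); z defaults to the
    two-sided 95% normal quantile.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# 1147 eligible patients give 802 training and 345 validation records.
train_ids, validation_ids = split_cohort(range(1147))
print(len(train_ids), len(validation_ids))  # 802 345
```

Rounding the training size down reproduces the cohort sizes stated in the text (802 and 345); a split stratified by site or outcome would need additional logic.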
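The score-to-probability relationship and the AUC can be illustrated with a short sketch. All coefficient values below are hypothetical placeholders rather than the fitted values from Table 3, and the function names are assumptions made for this example. The AUC is computed here as the proportion of (severe, non-severe) pairs in which the severe case has the higher score, with ties counted as one half, which is equivalent to the area under the empirical ROC curve.

```python
import math


def linear_score(intercept, coefficients, values):
    """Model constant plus the sum of beta * predictor value."""
    return intercept + sum(coefficients[k] * values[k] for k in coefficients)


def score_to_probability(score):
    """Logistic transform p = e^score / (1 + e^score), computed stably."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)


def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative cases."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical coefficients for illustration only (not from Table 3).
coefficients = {"infant": 1.0, "comorbidity": 2.0}
patient = {"infant": 1, "comorbidity": 1}
score = linear_score(-3.0, coefficients, patient)
probability = score_to_probability(score)
```

The pairwise AUC is quadratic in the number of patients, which is adequate for cohorts of this size; a rank-based formulation would scale better for larger datasets.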
All tests were two sided and statistical significance was set at p value <0.05 unless otherwise stated. Analysis was conducted in R (R Core Team, 2020) and SAS version 9.4 software (SAS Institute; Cary, North Carolina, USA).

Risk score model development and validation
The training and validation datasets did not differ in demographic, clinical, laboratory or outcome data, except for a lower lymphocyte count and the presenting symptom of diarrhea (Tables 1 and 2). In the multivariable model, infant age group, presence of comorbidities, seizures, vomiting, fever and higher absolute neutrophil count were associated with an increased risk of developing severe/critical COVID-19 infection. The presence of coryza at presentation and higher hemoglobin and platelet counts were associated with a decreased risk of severe/critical COVID-19 infection. The severe/critical COVID-19 infection score based on the final multivariable model is given in Table 3. In the sensitivity analysis, we applied the predictive model to two random sites to ensure its predictive ability was maintained across sites. The AUCs were satisfactory.

Discussion
Utilizing early hospital admission data from a multicenter network, we generated a simple clinical predictive model to identify children who may progress to develop severe/critical COVID-19 infection. Initial studies have shown that children with COVID-19 infection may not demonstrate the same degree of disease severity as adults [11]. However, subsequent studies seem to suggest that there is a bimodal severity peak: the first peak occurs in young infants (<3 months) and the second in adulthood [23][28][29][30]. Similar to these prior studies conducted in the United States and the United Kingdom, our predictive model identified infants as a group with increased risk of developing severe COVID-19 disease. The exact reason for this susceptibility remains unclear, though maternal vaccination and breastfeeding practices most likely play a role. The maternal IgG humoral response to vaccination (or infection) has been demonstrated to transfer across the placenta to the fetus, conferring protection to the newborn [31]. Hence, cohorts with a low vaccination rate in expectant mothers (or cohorts recruited prior to vaccine approval in pregnancy) may have inferior protection of the newborn. Although COVID-19 antibodies (SARS-CoV-2 spike RBD-specific IgG1, IgA and IgM antibodies) were detected in breast milk, these were absent in the infants' serum [32]. In contrast, other studies, such as a multicenter multivariable Bayesian modeling study conducted in Spain, reported age <2 years to be protective against critical COVID-19 infection [33]. The infant age group was therefore included as a discriminatory factor associated with severe COVID-19 infection in our model.
The presence of complex comorbidities increases the risk of hospitalization and severe COVID-19 disease [21,22]. In particular, in a large cross-sectional pediatric study (n = 43,465), cardiovascular diseases, type I diabetes, obesity and prematurity were shown to be associated with severe COVID-19 infection [34]. This is not surprising given that the majority of deaths due to COVID-19 infection occur in patients with underlying metabolic and cardiovascular disease [35]. The predisposition of adult patients with metabolic and cardiovascular comorbidities to COVID-19 disease is recognized to be associated with their pro-inflammatory and hypercoagulable tendencies [36]. Obesity also results in altered respiratory mechanics, predisposing to severe respiratory infections [37]. It is plausible that these mechanisms also apply to the pediatric patient. Although our study found the presence of comorbidities to be associated with 6-fold increased odds of severe COVID-19 disease (Table 3), we were not able to replicate, with granularity, the contribution of each type of comorbidity due to our smaller sample size. Nevertheless, the frequencies of cardiovascular, respiratory, gastrointestinal, hematology-oncology and neurology comorbidities were higher in the severe group (S1 Table). The types of hematology-oncology comorbidity and the treatments received at the time of the study, which could have affected outcomes, were also not captured. Symptoms associated with severe COVID-19 infection identified in our study included fever, vomiting and the presence of seizures. Considering that respiratory-related COVID-19 disease was more frequent in children [33], it is interesting to note that the presence of non-respiratory symptoms was associated with severe disease. These symptoms potentially indicate systemic involvement. Viral particles have been demonstrated in bronchial and alveolar epithelium, and in myocardial, intestinal, hepatic, splenic, renal and brain tissue [38].
Reactive microglia, neuronal ischemia and congestion are among the pathologic findings reported in children with COVID-19 infection who presented with acute encephalopathy and seizures [38]. Our study identifies seizures as the symptom associated with the highest β coefficient for severe disease. It is possible that this symptom reflects direct central nervous system (CNS) infection, is part of a systemic syndrome (e.g., shock, cytokine storm, electrolyte imbalance) or is part of underlying epilepsy [39]. In contrast, a symptom indicating a mild upper respiratory tract infection (coryza) was protective against severe COVID-19 disease. Our predictive model included routinely available laboratory variables. Neutrophilia was associated with the development of severe disease, whereas higher hemoglobin and platelet counts were protective. Neutrophilia or relative lymphopenia has been shown to be a feature of both severe respiratory disease and MIS-C [18,19,21,40]. Though less studied, low hemoglobin (anemia) has been associated with an increased risk of respiratory failure, ICU admission, mechanical ventilation and death [41,42]. This hemoglobin effect may be a reflection of disease severity, underlying comorbidity, or malnutrition [42]. Studies have associated thrombocytopenia with critical illness and death in adult COVID-19 infections [43,44]. Interestingly, a trend towards less critical illness was also observed in adult patients with high platelet counts (>400 × 10^9/L), though this has not been previously demonstrated in children until our study [43,45]. The mechanism responsible for the protective effect of platelets is unclear but is postulated to be related to a protective role of platelets towards the lung parenchyma and improved viral clearance [45].
Though predictive models for severe COVID-19 infection in children have been previously proposed [21,40], these were generated from single countries. Our study utilized data from a network of hospitals across Asia, including a population of children with diverse socioeconomic, cultural and biological backgrounds, which may increase its generalizability. To generate the model, we used routinely available demographic, clinical and laboratory data, which are likely relevant in most pediatric admissions. However, there were several limitations in this study. Firstly, the diagnosis of MIS-C, which requires a laboratory-confirmed SARS-CoV-2 infection within the prior 4 weeks, was challenging in regions where routine PCR or serological evidence of previous infection was not available for patients who had mild or asymptomatic primary infections. Some cases may have been missed due to this criterion. Conversely, clinical MIS-C features (e.g., fever, gastrointestinal symptoms, hypotension and high inflammatory markers) may also be present in acute COVID-19 infection [24,46]. As such, even though the immunopathology associated with sub-acute/post-acute MIS-C is distinct from the cytokine storm of acute COVID-19 infection, the clinical presentation may not be easy to differentiate [47]. We excluded all patients who fulfilled MIS-C criteria in our sensitivity analysis to ensure the predictive model performed satisfactorily regardless of this challenging diagnosis. This limitation also precluded analysis to differentiate patients with and without COVID-19 immunity. Secondly, we were not able to account for the circulating variants of concern that were dominant at different time periods at each of the sites. For example, the Delta variant (B.1.617.2) of SARS-CoV-2 was dominant in Singapore by May 31, 2021, whereas in Indonesia and Malaysia it became dominant by July 8 and July 23, 2021, respectively [48][49][50].
Moreover, availability of sequencing data in some regions was limited and may be subject to sampling bias [51]. Due to the retrospective nature of data collection, the perceived severity of disease may be biased and there was no standardized reporting, laboratory testing or management protocol. As such, we could not investigate other commonly used laboratory tests such as C-reactive protein, procalcitonin and lactate dehydrogenase in the model [52].
Further external validation is needed to evaluate the performance of this predictive model in the clinical setting and in other geographical regions. Lastly, we did not explore other important outcomes, including mortality (owing to the small sample size, n = 33), and the deaths reported in this study may or may not have been directly due to COVID-19 infection.

Conclusion
In summary, we created a predictive model to identify children who may develop severe/critical COVID-19 infection using routinely available hospital level data. This novel model should be validated further in other settings and is potentially useful to hospitalists in stratifying patients into those who may benefit from closer monitoring in a higher level of care.